training data bias
Query use case
Is there evidence that the data on which this system was trained has (or is free of) bias?
Schemas used
Pseudo code
FUNCTION ai_system_has_data_bias(AI_System_ID)
    CREATE empty list Attestations
    // Step 1: Retrieve dataset verification credentials linked to the AI system
    SET Data_VC_IDs = get dataset verification credentials associated with AI_System_ID
    // Step 2: Identify bias attestations in each dataset verification credential
    FOR EACH Data_VC_ID in Data_VC_IDs DO
        SET Attestations_List = get bias attestations linked to Data_VC_ID
        FOR EACH Attestation in Attestations_List DO
            IF Attestation is of type "bias" THEN
                SET Component_Hash = Attestation's component hash
                SET Bias_Details = Attestation's details
                ADD ({"component": Component_Hash, "data_vc_id": Data_VC_ID}, Bias_Details) TO Attestations
            END IF
        END FOR
    END FOR
    // Step 3: Return all bias attestations found
    RETURN Attestations
END FUNCTION
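The pseudo code above could be sketched in Python roughly as follows. This is a minimal illustration, not the actual implementation: the two in-memory dictionaries, the sample IDs, and the attestation field names (`type`, `component_hash`, `details`) are assumptions standing in for whatever credential store and schema the system really uses.

```python
from typing import Any

# Hypothetical in-memory stores standing in for the credential registry.
DATA_VCS_BY_SYSTEM: dict[str, list[str]] = {
    "ai-system-1": ["data-vc-1", "data-vc-2"],
}
ATTESTATIONS_BY_DATA_VC: dict[str, list[dict[str, Any]]] = {
    "data-vc-1": [
        {"type": "bias", "component_hash": "abc123", "details": "gender imbalance"},
        {"type": "license", "component_hash": "abc123", "details": "CC-BY-4.0"},
    ],
    "data-vc-2": [],
}


def ai_system_has_data_bias(ai_system_id: str) -> list[tuple[dict[str, str], Any]]:
    """Collect every bias attestation on the datasets linked to an AI system."""
    attestations = []
    # Step 1: dataset verification credentials linked to the AI system.
    for data_vc_id in DATA_VCS_BY_SYSTEM.get(ai_system_id, []):
        # Step 2: keep only attestations of type "bias".
        for attestation in ATTESTATIONS_BY_DATA_VC.get(data_vc_id, []):
            if attestation.get("type") == "bias":
                component = {
                    "component": attestation["component_hash"],
                    "data_vc_id": data_vc_id,
                }
                attestations.append((component, attestation["details"]))
    # Step 3: return all bias attestations found (possibly an empty list).
    return attestations
```

In this sketch an empty list only means that no bias attestation was recorded for the linked datasets; an unknown system ID also yields an empty list.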
Explanation
- Find relevant data sources:
  - Retrieve the configuration verification credential (ConfigVcId) for the AI system.
  - Extract the weights verification credential (WeightsVcId) used in training.
  - Ensure that the WeightsVcId is classified as "Weights".
  - Trace back to the training system that produced these weights.
  - Identify the datapack used in the training process.
- Extract the list of Data Verification Credentials (DataVcIds) used in training from the datapack.
- Identify attestations that indicate bias:
  - For each DataVcId, retrieve its bias attestations.
  - If an attestation is labeled as "bias", extract its component_hash and BiasDetails.
- Return a list of bias attestations:
  - Each entry consists of a tuple:
    - Component information (component hash and DataVcId).
    - Bias details describing the detected bias.
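The first part of the explanation, tracing an AI system back to the data credentials used in training, might look like the sketch below. All lookup tables, sample IDs, and the function name `find_training_data_vcs` are hypothetical and exist only to illustrate the chain of lookups; the real relations live in the credential schemas.

```python
# Hypothetical lookup tables; names and contents are assumptions for illustration.
CONFIG_VC_BY_SYSTEM = {"ai-system-1": "config-vc-1"}
WEIGHTS_VC_BY_CONFIG = {"config-vc-1": "weights-vc-1"}
VC_CLASSIFICATION = {"weights-vc-1": "Weights"}
TRAINING_SYSTEM_BY_WEIGHTS = {"weights-vc-1": "training-system-1"}
DATAPACK_BY_TRAINING_SYSTEM = {"training-system-1": "datapack-1"}
DATA_VCS_BY_DATAPACK = {"datapack-1": ["data-vc-1", "data-vc-2"]}


def find_training_data_vcs(ai_system_id):
    """Walk from an AI system back to the data VCs used to train it."""
    # Configuration credential for the AI system, then its weights credential.
    config_vc_id = CONFIG_VC_BY_SYSTEM.get(ai_system_id)
    weights_vc_id = WEIGHTS_VC_BY_CONFIG.get(config_vc_id)
    # Only follow credentials that are classified as "Weights".
    if VC_CLASSIFICATION.get(weights_vc_id) != "Weights":
        return []
    # Training system that produced the weights, then the datapack it used.
    training_system_id = TRAINING_SYSTEM_BY_WEIGHTS.get(weights_vc_id)
    datapack_id = DATAPACK_BY_TRAINING_SYSTEM.get(training_system_id)
    # Data verification credentials listed in that datapack.
    return list(DATA_VCS_BY_DATAPACK.get(datapack_id, []))
```

If any link in the chain is missing, this sketch returns an empty list instead of raising, so a caller cannot tell "no training data recorded" apart from "unknown system" without an extra check.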
Query
ai_system_has_data_bias(AiSystemId, Attestations)
- link to query
- link to simulator
Notes
This query assumes that a trusted method exists for identifying bias in a dataset.